perm filename ACM.LE1[LET,JMC]2 blob sn#122467 filedate 1974-10-01 generic text, type T, neo UTF8
\\M0BASL30;\M1BASI30;\M2NGR40;\M3BASB30;\M4SUB;\.
\F2\CSTANFORD ARTIFICIAL INTELLIGENCE LABORATORY
\CDEPARTMENT OF COMPUTER SCIENCE
\CSTANFORD UNIVERSITY
\CSTANFORD, CALIFORNIA 94305
\F0


						

							September 30, 1974


ACM Forum
Communications of the ACM
ACM Headquarters Office
1133 Avenue of the Americas
New York, New York 10036


\J	Friedman and Hoffman use an encipherment technique in their
paper  \F1Execution  Time Requirements  for  Encipherment Programs\F0
[\F1Communications of  the  ACM\F0  17,8 (August  1974)]  that  seems
too easily broken by the \F1probable word\F0 method
for general use in time-sharing systems.
While they  don't advertise the technique as  good, someone
might  use it with regrettable  results.  The object  of this note is
not just to point out the defect, but to pose the problem in general,
to make explicit a criterion for a cipher to be \F1probable-word-proof\F0,
and to suggest a modification of their system that may meet it.
Unfortunately, their timing results probably don't apply to the
modified system.

	Their  method  involves  generating  a  pseudo-random
running key as a sequence  {\F1x\F4n\F0} of numbers  continued by the
rule \F1x\F4n+1\F1  = ax\F4n\F1 +  c (mod  p) \F0where  \F1p\F0 is  a
prime.  The \F1x\F0's are then \F1exclusive-or\F0ed with packed words
of  the text to  form the cryptogram.   In the case  in question, the
words are 60 bits long as fits the CDC 6400.
The method is immediately suggested by the known fact that \F1exclusive
or\F0ing a truly random key with the words of a message gives a
cipher that has been proved undecryptable and trying to replace the
random numbers with the output of a commonly used pseudo-random number
generator.  Unfortunately, the cipher can be attacked as follows:

	The villain  guesses  a plain  text sequence  240 bits  long.
This may not always  be possible, and how he goes about it depends on
what he already knows.  For  example, he may know a favorite  phrase,
he may suspect a quotation, or the document may be a computer program
containing  a guessed subroutine  or the  user may be  trying to keep
secret a modified version of a known program.

	The villain's computer program slides his  240 bits
along the  cryptogram a character at a time  doing \F1exclusive or\F0s
in  order  to  conjecture three successive  \F1x\F4i\F0's  totaling  180
bits - say \F1x\F4m\F1,  x\F4m+1\F0 and \F1x\F4m+2\F0. For  each such
guess, the program solves the linear equations \F1x\F4m+1\F1 = ax\F4m
\F1+ c\F0 modulo \F1p\F0 to get  a conjectured \F1a\F0 and \F1c\F0.
The conjectured
\F1a\F0   and  \F1c\F0  are   used  to   continue  the   sequence  of
\F1x\F4i\F0's, e.g. to  get \F1x\F4m+3\F0 and  \F1x\F4m+4\F0.   These
are \F1exclusive-or\F0ed with  more of the cryptogram to  get further
new conjectured  plain text.  This conjectured plain text  is tested with
trigram lists to see  if it is  plausible English, and  sequences
passing the test  are displayed to the villain at  his terminal.  When
he says he likes what he sees,  the program
uses \F1a\F0 and \F1c\F0 to continue the key backward and forward
and decipher the message.

	The  weakness of  the  cipher  described  by Friedman  and
Hoffman is  that the key sequence is continuable  once 180 bits of it
have been  guessed.  The  method can  be improved  by subjecting  the
output  of the random number  generator  to an  information  losing
transformation  in order  to get  the running  key.  Even  using only
every other bit produced by the generator would make it unobvious how
to proceed,  but since the  villain's problem would  still be linear,
mathematics might save him.  It seems better to use a  second
random number generator to  select which bits from the  output of the
first generator would go into the running key.  While this looks good
to me, I don't certify it.

	All this suggests the following criterion for  a cipher to be
\F1probable-word-proof\F0. Suppose all  the plain  text known except
for one  character.   Then it  should still  require an  unacceptable
amount of  work to  learn more  about the  missing character  than is
suggested by the known remainder of the message and the statistics of
the assumed message population.

 	Several years  ago an encipherment  system that we  hoped was
probable-word-proof was  installed as a utility program at the Stanford
Artificial  Intelligence  Laboratory  on  our  PDP-10.  Its  fate  is
instructive.   A  villain replaced  the  source program  by one  that
whenever used copied the name of the enciphered file and the initialization
key into a file in  the system area.

	Of course, no security is possible if the villain can tap your
terminal or controls the time-sharing system.  Moreover, if you run a
program that the villain can replace, no password system within the
program can save you, because the villain can replace your program by
one that runs your program interpretively and also makes a file of
whatever you type.

	Suppose, however, that you trust the part of the time-sharing
system that  allows you to examine and  modify registers in your core
image.  Then, for  the PDP-10, it is possible  to check the first  10
words of a core image, deposit two more words in specified registers,
and  run the program. The hand checked  10 words will "check sum" the
whole program  using a special  algorithm that  depends on the  first
word  entered and  compare  the result  with the  second  word before
giving control to the program.  We hope, but haven't proved, that the
villain cannot modify the  main program, in a way  independent of the
two  words  you  enter,  so  that  the  test  will  be  passed.  This
intermediate   degree   of   security   may   be   useful   in   some
circumstances.  For the benefit of PDP-10 fans, the two words are put
in registers 0 and 1, and the 10 word program is\.

S:	MOVSI 2,-200
	MOVEI 3,0
	MOVE 4,S(2)
	FMP 4,0
	ROT 3,1
	ADD 3,4
	AOBJN 2,S+2
	CAMN 3,1
	JRST GO
	HALT.

\J\F1Communications\F0 readers may prefer ALGOL, but then they must
trust the compiler.\.


	
							John McCarthy
							Computer Science Department
							Stanford University

ACM.LE1[LET,JMC]:SU-AI